Real-Time Action Detection in Video Surveillance using Sub-Action Descriptor with Multi-CNN
Authors
Abstract
When we say a person is texting, can you tell whether the person is walking or sitting? Emphatically, no. To solve this incomplete representation problem, this paper presents a sub-action descriptor for detailed action detection. The sub-action descriptor consists of three levels: the posture, the locomotion, and the gesture level. The three levels give three sub-action categories for one action to address the representation problem. The proposed action detection model simultaneously localizes and recognizes the actions of multiple individuals in video surveillance using appearance-based temporal features with multi-CNN. The proposed approach achieved a mean average precision (mAP) of 76.6% in frame-based and 83.5% in video-based measurement on the new large-scale ICVL video surveillance dataset, which the authors introduce and make available to the community with this paper. Extensive experiments on the benchmark KTH dataset demonstrate that the proposed approach achieves better action recognition performance than the state of the art. The action detection model runs at around 25 fps on the ICVL dataset and at more than 80 fps on the KTH dataset, which is suitable for real-time surveillance applications.
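The three-level descriptor from the abstract can be pictured as a small record with one field per level. The following is a minimal sketch for illustration only, not the authors' implementation: the class names and the label values in each level are hypothetical, chosen to match the texting example in the abstract.

```python
from dataclasses import dataclass
from enum import Enum


class Posture(Enum):  # hypothetical label set for the posture level
    SITTING = "sitting"
    STANDING = "standing"


class Locomotion(Enum):  # hypothetical label set for the locomotion level
    STATIONARY = "stationary"
    WALKING = "walking"


class Gesture(Enum):  # hypothetical label set for the gesture level
    NONE = "none"
    TEXTING = "texting"


@dataclass(frozen=True)
class SubActionDescriptor:
    """One action described by three sub-action categories (illustrative)."""

    posture: Posture
    locomotion: Locomotion
    gesture: Gesture

    def describe(self) -> str:
        # Join the three levels into a single readable label.
        return ", ".join(
            (self.posture.value, self.locomotion.value, self.gesture.value)
        )


# Two people who are both "texting" but differ at the other two levels.
walking_texter = SubActionDescriptor(
    Posture.STANDING, Locomotion.WALKING, Gesture.TEXTING
)
sitting_texter = SubActionDescriptor(
    Posture.SITTING, Locomotion.STATIONARY, Gesture.TEXTING
)

print(walking_texter.describe())
print(sitting_texter.describe())
```

A single label such as "texting" cannot separate these two cases, while the three-field record can, which is the representation problem the abstract describes.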
Similar resources
Action Change Detection in Video Based on HOG
Background and Objectives: Action recognition, the process of labeling an unknown action in a query video, is a challenging problem due to event complexity, variations in imaging conditions, and intra- and inter-individual action variability. A number of solutions have been proposed to solve the action recognition problem. Many of these frameworks suppose that each video sequence includes only one ...
Weighted Directional 3D Stationary Wavelet-based Action Classification
The aim of intelligent surveillance is to conceive reliable and efficient systems that can detect moving objects in complicated real-world scenes. These systems also track the detected objects and analyze their actions and activities. Many applications are built on these operations, such as advanced robotics and human-computer interaction. This paper aims at proposing a framework f...
Second-order Temporal Pooling for Action Recognition
Most successful deep learning models for action recognition generate predictions for short video clips, which are later aggregated into a longer time-frame action descriptor by computing a statistic over these predictions. Zeroth-order (max) or first-order (average) statistics are commonly used. In this paper, we explore the benefits of using second-order statistics. Specifically, we propose a novel e...
Multi-scale and real-time non-parametric approach for anomaly detection and localization
In this paper, we propose an approach for anomaly detection and localization in video surveillance applications, based on spatio-temporal features that capture scene dynamic statistics together with appearance. Real-time anomaly detection is performed with an unsupervised approach using nonparametric modeling that directly evaluates multi-scale local descriptor statistics. A method to update sce...
Smart Surveillance System for Face Recognition
A smart surveillance system refers to video-level processing techniques for identifying unwanted (terrorist) faces in real-time video. Video object segmentation is an important part of a real-time surveillance system. For any video segmentation algorithm to be suitable for real-time use, it must impose a low computational load. The work presented here is divided into two main parts: (1) Face Detect...
Journal title:
- CoRR
Volume abs/1710.03383, Issue -
Pages -
Publication date 2017